I.   INTRODUCTION

The discrete time step pursuer-evader game was first described
by Rufus Isaacs of the Rand Corporation in the early 1950's in
an attempt to look at the problem of attacking a moving target
who is maneuvering so as to confound the prediction of his
future position. The general problem, as described by Isaacs,
is as follows:

A  battleship  in  midocean  is  aware  of  an  enemy  bomber's 
presence,  but  the  plane  is  too  high  for  precise 
detection.   The  ship  is  interested  only  in  not  being 
hit;  it  has  no  offensive  means.   The  plane  has  one  bomb 
and  we  suppose--to  avoid  extraneous  factors--that  the
bomber's  aim  is  excellent.   The  battleship  knows  this, 
but  knows  nothing  about  when  or  where  the  bomb  will  be 
dropped  until  after  detonation.   It  is  to  maneuver  so 
as  to  minimize  the  hit  probability.  .  .  There  is  a  time 
lag  T  between  the  bomber's  last  sighting  of  the  ship  and 
detonation.   Thus  the  bomber  must  aim  at  an  anticipated 
position  of  the  ship  .  .  .  As  simple  as  this  problem 
sounds  circumstantially,  it  is  difficult  technically. 
To  gain  a  foothold,  we  simplified  it  further.   We  made 
the  ocean  one-dimensional  and  discrete.   That  is,  we 
supposed  the  battleship  to  be  located  on  one  of  a  long 
row  of  points  and  at  each  unit  of  time  he  hops  to  one 
adjoining  one,  enjoying  the  sole  choice  of  a  right  or 
left  jump.   The  time  lag  was  to  be  an  integral  number  n 
of  time  units,  or--the  same  thing--of  jumps.   This  is 
tantamount  to  saying  that  the  bomber  knows  all  positions 
of  the  battleship  which  precede  his  present  one  by  n 
jumps  or  more  [Ref. 1].

The solution to the single time step game (i.e. n=1) is
trivial, but the complexity increases greatly as the time lag
or number of time steps increases. Isaacs, upon formulating
the game, proposed pursuer and evader strategies for the
two-step game; however, the proof of the optimality of these
strategies is highly complex. The complexity of the multiple
step games arises from the fact that the evader doesn't know
when the pursuer will attack; if he did, it would be an easy
matter for the evader to distribute himself uniformly over
the n+1 possible positions at the time of detonation and
limit the pursuer to a kill probability of 1/(n+1).
Without knowing the time of attack the evader must attempt
to make his position uniform at every time step, and this is
not possible.

The three-step pursuer-evader game is as yet unsolved;
however, near-optimal strategies for both the pursuer and
evader have been described. The best existing evader
strategy, developed by Joseph Bram [Ref. 2], involves the
evader maintaining an infinite memory of probabilities
corresponding to the probability of turning given the evader
has not turned for the last k moves. This thesis will
investigate alternative finite evader strategies in an attempt
to lower the existing upper bound on the three-step game
value while drastically reducing memory requirements, and
will additionally look briefly at possible evader strategies
in the four-step game.


II.   KNOWN SOLUTIONS AND STRATEGIES FOR PURSUER-EVADER GAMES

A.   STRUCTURE 

For  uniformity,  the  convention  and  structure  described 
below  will  be  used  hereafter  in  the  description  of  all 
discrete  n-step  pursuer-evader  games.   The  pursuer  is  the 
maximizing  player  who  by  selection  of  time  of  fire  and  aim 
point  tries  to  maximize  the  probability  of  killing  the 
evader  (a  kill  is  achieved  when  the  pursuer  fires  at  the 
position  the  evader  subsequently  occupies  n  time  steps 
later).   The  evader  is  the  minimizing  player,  who  by  selection
of  maneuvers  along  the  discrete  linear  state  space,
attempts  to  minimize  the  probability  of  being  killed.   The 
evader's  maneuvers  can  be  described  as  a  sequence  of  lefts 
and  rights  (L  and  R)  with  each  n-bit  sequence  of  L's  and 
R's  corresponding  to  one  of  the  n+1  final  positions 
achievable  in  n  steps  from  an  arbitrary  starting  position  as 
shown in Figure 2.1. The above-described mapping of n-bit
left-right sequences to final position is symmetric under
interchange of L's and R's (i.e. LLR corresponds to a position
symmetric to that of RRL in the three-step case). Due to this
symmetry it is equivalent to describe the evader's maneuvers
as a sequence of straights and turns (S and T), which provides
an equivalent mapping in Figure 2.2. A turn signifies the
evader moves in the opposite direction to his previous move
and a straight signifies he continues in the same direction
as his previous move. Any n-bit sequence of lefts and rights
can be translated into an equivalent (n-1)-bit sequence of
straights and turns (i.e. LRRL becomes TST). Note that in
general there may be several possible sequences of turns and
straights which lead to the same final position (for n=3,
TST, TTT, and STS all result in the evader occupying the
position one step to the left of his original position).

Figure 2.1    Possible Evader Positions in n Steps.

Figure 2.2    Possible Evader Positions in Terms of Straights and Turns.
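
The translation between the two descriptions can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, counting R as +1 and L as -1 is a convention chosen here, and the move made just before the sequence is assumed to be known (taken as R in the example).

```python
def lr_to_st(moves: str) -> str:
    """Translate an n-bit L/R string into the (n-1)-bit S/T string of its successive pairs."""
    return "".join("S" if a == b else "T" for a, b in zip(moves, moves[1:]))


def st_to_lr(previous_move: str, maneuvers: str) -> str:
    """Expand an S/T string into L/R moves, given the move made just before it (assumed known)."""
    flip = {"L": "R", "R": "L"}
    moves, last = [], previous_move
    for m in maneuvers:
        last = last if m == "S" else flip[last]
        moves.append(last)
    return "".join(moves)


def displacement(moves: str) -> int:
    """Net displacement of an L/R string, counting R as +1 and L as -1 (a convention chosen here)."""
    return sum(1 if m == "R" else -1 for m in moves)


print(lr_to_st("LRRL"))  # TST
for maneuvers in ("TST", "TTT", "STS"):
    # With the previous move taken to be R, each sequence nets one step to the left (-1).
    print(maneuvers, displacement(st_to_lr("R", maneuvers)))
```

Enumerating all eight three-move L/R strings with `displacement` gives only four distinct values, which is the n+1 count of final positions for n=3.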

B.   ONE-STEP GAME

The  single  step  pursuer-evader  game  has  a  simple 
solution.   With  only  one  time  step  elapsing  between  the 
pursuer's  time  of  fire  and  weapon  detonation  the  evader  can 
always  distribute  himself  uniformly  over  the  two  positions 
achievable in one step shown in Figure 2.3. The evader on
each step can continue straight with probability (1-p) or
turn with probability p. Since the intelligent pursuer will
limit his shot to one of the two feasible positions of the
evader when he fires (position 1 or 2 of Figure 2.3), the
game can be represented graphically as shown in Figure 2.4.
The minimax solution occurs when p=0.5. The corresponding
value of the game is 0.5. The optimal pursuer strategy is to
fire at position 1 or 2 with equal probability.


Figure 2.3    Achievable Evader Positions in One-Step Game.

Figure 2.4    Graphical Solution to the One-Step Game.

C.   TWO-STEP GAME

The two-step pursuer-evader game is not nearly as simple
in its solution as the one-step game. The solution was
found by starting with the hypothesis that the evader's
maneuver will depend only on his previous maneuver and none
earlier; thus the probability of continuing in the same
direction as the last move is denoted by (1-p), with p being
the probability of moving in the opposite direction to the
previous move. The attainable positions for the evader and
the corresponding probabilities under the above hypothesis
are shown in Figure 2.5. The pursuer can be expected to
select the position (1, 2 or 3) with the highest associated
probability. The evader will select p so as to minimize
this maximum probability. The optimal value of p is then
found by solving:

$$ \min_{p} \; \max \{\, p - p^2,\; p,\; (1-p)^2 \,\} \quad \text{s.t.} \quad 0 \le p \le 1.0 $$


Graphically the solution is shown in Figure 2.6. The
resulting solution is found by solving the quadratic p = (1-p)^2,
which has a root at p = (3-√5)/2 = 0.38197...; this value
is also the probability that the evader is in position 2 or
3 of Figure 2.5 and thus the value of the game. The proof
that this evader strategy is optimal and that (3-√5)/2 is
the value of the game is complex. Three different proofs are
given by Dubins [Ref. 3], Isaacs [Ref. 4] and Ferguson
[Ref. 5]. The pursuer strategies in the multi-step games
are characterized by the non-existence of an optimal
strategy; the pursuer can always increase his expected
kill probability by waiting a few more time periods, but he
cannot wait indefinitely to fire or his payoff is zero.
This contradiction leads to strategies for the pursuer which
have payoffs arbitrarily close to, but not equal to, the
value of the game. Ferguson developed such a pursuer
strategy, which confirmed that (3-√5)/2 = 0.38197... was
the value of the two-step game.

Figure 2.5    Achievable Evader Positions in Two-Step Game.

Figure 2.6    Graphical Solution to the Two-Step Game.
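
As a numerical check on the two-step evader problem, the sketch below minimizes the largest of the three position probabilities p - p^2, p and (1-p)^2 over a grid and compares the result with the closed-form root. The grid search and the names `worst_case` and `GRID` are illustrative choices made here, not the method used in the cited proofs.

```python
import math


def worst_case(p: float) -> float:
    """Largest of the three two-step position probabilities for turn probability p."""
    return max(p - p * p, p, (1 - p) ** 2)


GRID = 100_000  # grid resolution (an arbitrary choice for this sketch)
best_p = min((i / GRID for i in range(GRID + 1)), key=worst_case)
closed_form = (3 - math.sqrt(5)) / 2  # root of p = (1 - p)^2 lying in [0, 1]

print(round(best_p, 5), round(worst_case(best_p), 5), round(closed_form, 5))
```

At the root, p and (1-p)^2 coincide and both exceed p - p^2, so the evader's worst case equals p itself.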

D.   THREE-STEP GAME

As stated earlier the three-step pursuer-evader game is as
yet unsolved. The value of the three-step game has been
bounded to:

0.28423 < V < 0.28903

by Bram. This section will investigate previous near-optimal
evader strategies for the three-step game and the
resulting upper bounds upon the game value.

1.   Markov Hypothesis Strategy

The Markov Hypothesis for the n-step pursuer-evader
game is stated as follows: the probability that the evader
will go left or right (or, straight or turn) is dependent on
the previous n-1 moves but not on any moves further in the
past than the (n-1)st. This form of evader strategy makes
intuitive sense since it does not seem likely that an
optimal evader strategy will depend upon information which
the pursuer already knows at the time of fire. The known
optimal strategies for the one and two-step games adhere to
the Markov Hypothesis. In the one-step game the optimal
evader turns or continues straight with equal probability,
therefore independent of all previous moves (i.e. P(S) =
P(T) = P(L) = P(R)). In the two-step game the optimal
evader uses a strategy where the probability of turning (or
continuing straight) depends only upon his previous move
(i.e. P(S) = P(L|L) = P(R|R) = 0.61803 and P(T) = P(L|R) =
P(R|L) = 0.38197).

The Markov Hypothesis will now be applied to the
three-step game. Since the evader will condition his next
move upon his previous two moves, his strategy can be
described by a 2x2 transition matrix as shown in Figure 2.7.
The state of the evader at any time is S or T since this
state is a function of the evader's last two moves (i.e. LL
or RR -> S). In this transition matrix:

q_1 = P(Next state is S | Last state was S)
q_2 = P(Next state is S | Last state was T).

Figure 2.7    Markov Hypothesis Transition Matrix for Three-Step Game.

Figure 2.8    Achievable Evader Positions in Three-Step Game.

The four achievable positions for the evader in the three-step
game and the associated maneuver sequences are shown in
Figure 2.8. Let the variable W represent the final position
of the evader three steps after the time of fire; from
Figure 2.8 it can be seen W ∈ {1, 2, 3, 4}. Let the variable
STATE represent the state (S or T) that the evader occupies
at the time of fire. The probability that the evader
occupies any final position is a function of q_1 and q_2 when
conditioned upon his initial state. For example, given
STATE=S, to arrive at W=1, the sequence of transitions
undergone must be:

S  to  T  to  S  to  S 


The  probability  of  this  occurrence  can  be  written: 


$$ P(W=1 \mid \text{STATE}=S) = (1-q_1)\,q_2\,q_1 $$

The remaining seven conditional probabilities are:

$$
\begin{aligned}
P(W=2 \mid \text{STATE}=S) &= (1-q_1)q_2(1-q_1) + (1-q_1)(1-q_2)^2 + q_1(1-q_1)q_2 \\
P(W=3 \mid \text{STATE}=S) &= (1-q_1)(1-q_2)q_2 + q_1(1-q_1)(1-q_2) + q_1^2(1-q_1) \\
P(W=4 \mid \text{STATE}=S) &= q_1^3 \\
P(W=1 \mid \text{STATE}=T) &= (1-q_2)q_2 q_1 \\
P(W=2 \mid \text{STATE}=T) &= (1-q_2)q_2(1-q_1) + (1-q_2)^3 + q_2(1-q_1)q_2 \\
P(W=3 \mid \text{STATE}=T) &= (1-q_2)^2 q_2 + q_2(1-q_1)(1-q_2) + q_2 q_1(1-q_1) \\
P(W=4 \mid \text{STATE}=T) &= q_2 q_1^2
\end{aligned}
$$

At any time the pursuer may choose to fire, he knows
which of the two states (S or T) the evader is in by
observing his last two moves. The optimal values of q_1 and
q_2 under this strategy are found by solving the following
non-linear problem:

$$ \min_{q_1,\,q_2} \; \max_{i,\,j} \{\, P(W=j \mid \text{STATE}=i) \,\}, \qquad j = 1,2,3,4; \quad i = S,T $$

The solution, due to Ferguson, is q_1 = 0.63397..., q_2 =
0.73205..., with a corresponding bound on the game value of
0.29423. The resulting matrix of conditional probabilities is
shown in Table I. Ferguson states, when presenting this evader
strategy, that it is not known to be optimal, and in fact he
conjectures that no strategy of finite dependence is
optimal for the evader. The strategy of Bram presented in
the next section will show that indeed an evader strategy of
infinite dependence does result in a tighter bound on the
game value.
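
The matrix of conditional probabilities for the Markov Hypothesis strategy can be recomputed by enumerating the eight 3-bit maneuver sequences. The sketch below is illustrative only: the `POSITION` lookup is inferred from the sequences named in the text (TSS for W=1, SSS for W=4, and TST, TTT, STS for the position one step to the left), since Figure 2.8 itself is not reproduced here, and the function name is hypothetical.

```python
from itertools import product

# Final position W for each 3-bit S/T sequence (inferred from the text, see lead-in).
POSITION = {"TSS": 1, "TST": 2, "TTT": 2, "STS": 2,
            "TTS": 3, "STT": 3, "SST": 3, "SSS": 4}


def position_probs(state: str, q: dict) -> dict:
    """P(W = w | STATE = state), where q[s] = P(next maneuver is S | state s)."""
    probs = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}
    for seq in product("ST", repeat=3):
        prob, current = 1.0, state
        for maneuver in seq:
            prob *= q[current] if maneuver == "S" else 1.0 - q[current]
            current = maneuver  # the new state is simply the latest maneuver
        probs[POSITION["".join(seq)]] += prob
    return probs


q = {"S": 0.63397, "T": 0.73205}  # Ferguson's values quoted in the text
table = {state: position_probs(state, q) for state in "ST"}
bound = max(max(row.values()) for row in table.values())

for state, row in table.items():
    print(state, [round(row[w], 5) for w in (1, 2, 3, 4)])
print("upper bound:", round(bound, 5))
```

The largest entry of the resulting 2x4 matrix is the payoff the pursuer can secure against this strategy, which is why it serves as an upper bound on the game value.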

TABLE I
P(W=w|STATE) for Three-Step Markov Hypothesis Strategy

q_1 = P(S|S) = 0.63397
q_2 = P(S|T) = 0.73205

STATE     W=1      W=2      W=3      W=4

S        .16987   .29423   .28109   .25480
T        .12435   .28719   .29423   .29423

2.   Infinite Dependence Strategy

As mentioned in Chapter I, the best existing evader
strategy for the three-step game was described by Joseph
Bram. This strategy can be described as an infinite sequence
of the conditional probabilities that the evader will
continue straight given the state S of his previous moves. If
the previous move by the evader was a turn, the evader is in
state S=1, while if the k-1 moves since his last turn have
all been straight the evader is in state S=k. (Note that the
state space of S is infinite.) We will denote a turn by T and
a straight by S as before. At each time step the evader
continues straight or turns with a probability dependent upon
his state S. Let:

p_k = P(Straight | S=k).

If the evader is in state k at some time n, at time n+3 the
evader can be in one of four positions described by W
previously. There are eight possible 3-bit sequences of S's
and T's which correspond to the four possible terminal
positions as shown in Figure 2.8. The probabilities
associated with each position w given k are as follows:

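$$
\begin{aligned}
P(W=1 \mid S=k) &= (1-p_k)\,p_1\,p_2 \\
P(W=2 \mid S=k) &= (1-p_k)p_1(1-p_2) + (1-p_k)(1-p_1)^2 + p_k(1-p_{k+1})p_1 \\
P(W=3 \mid S=k) &= (1-p_k)(1-p_1)p_1 + p_k(1-p_{k+1})(1-p_1) + p_k p_{k+1}(1-p_{k+2}) \\
P(W=4 \mid S=k) &= p_k\,p_{k+1}\,p_{k+2}
\end{aligned}
$$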

If the pursuer fires at time n, at position w, when S=k, his
expected payoff will be:

P(W=w | S=k)

The upper bound on the value of the game played with this
strategy is:

$$ \max_{k} \; \max_{w} \{\, P(W=w \mid S=k) \,\} $$

The evader of course will attempt to select his infinite
array of p_k's so as to minimize the above bound, which is the
maximum payoff that the pursuer can achieve. The best set
of p_k's found by Bram is delineated in Table II, while the
resulting P(W=w|S=k) is shown in Table III. The upper
bound on the game value under this specific set of p_k's is
the maximum value found in Table III, or 0.28903. In this
strategy the decision to turn or continue straight has a
dependence upon the previous moves. That dependence may
extend infinitely far back; thus the evader is required to
maintain the infinite array of p_k's to execute this
near-optimal strategy.

TABLE II
A Safe Set of p_k's for the Evader

 k      p_k

 1    .69290
 2    .62467
 3    .66775
 4    .65137
 5    .66241
 6    .65859
 7    .66135
 8    .66047
 9    .66116
10    .66096
11    .66114
12    .66109
13    .66114
14    .66113
15    .66114

TABLE III
P(W=w|S=k) using p_k's of Table II

 k     W=1      W=2      W=3      W=4

 1    .13292   .28903   .28903   .28903
 2    .16246   .27682   .28903   .27170
 3    .14381   .27905   .28903   .28818
 4    .15090   .27591   .28903   .28417
 5    .14612   .27634   .28903   .28852
 6    .14778   .27552   .28903   .28768
 7    .14658   .27560   .28903   .28880
 8    .14696   .27539   .28903   .28863
 9    .14666   .27539   .28903   .28892
10    .14675   .27534   .28903   .28889
11    .14667   .27534   .28903   .28896
12    .14669   .27532   .28903   .28896
13    .14667   .27532   .28903   .28898
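
The entries of Table III follow directly from the four expressions for P(W=w|S=k) and the p_k's of Table II. The sketch below recomputes rows k = 1 through 13, which need p_k up to k = 15; the list `P_TABLE_II` and the helper names are illustrative, and agreement with the printed table is only to within rounding of the five-digit p_k's.

```python
# p_k values transcribed from Table II (k = 1..15).
P_TABLE_II = [0.69290, 0.62467, 0.66775, 0.65137, 0.66241,
              0.65859, 0.66135, 0.66047, 0.66116, 0.66096,
              0.66114, 0.66109, 0.66114, 0.66113, 0.66114]


def p(k: int) -> float:
    """p_k = P(Straight | S = k), with k counted from 1 as in the text."""
    return P_TABLE_II[k - 1]


def row(k: int) -> tuple:
    """(P(W=1|S=k), P(W=2|S=k), P(W=3|S=k), P(W=4|S=k))."""
    pk, pk1, pk2, p1, p2 = p(k), p(k + 1), p(k + 2), p(1), p(2)
    w1 = (1 - pk) * p1 * p2
    w2 = (1 - pk) * p1 * (1 - p2) + (1 - pk) * (1 - p1) ** 2 + pk * (1 - pk1) * p1
    w3 = (1 - pk) * (1 - p1) * p1 + pk * (1 - pk1) * (1 - p1) + pk * pk1 * (1 - pk2)
    w4 = pk * pk1 * pk2
    return (w1, w2, w3, w4)


rows = {k: row(k) for k in range(1, 14)}
bound = max(max(r) for r in rows.values())

for k, r in rows.items():
    print(k, [round(x, 5) for x in r])
print("upper bound:", round(bound, 5))
```

Each row sums to one, and the W=3 column is essentially constant across k, which is the property that makes W=3 a uniformly good aim point for the pursuer.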

3.   Sub-Markov Strategy

The strategy presented here is due to Bouchoux
[Ref. 6] and is characterized by the evader's sequence of
moves not being Markovian in itself but being generated by a
substructure which is Markovian, hence the description
Sub-Markov. This form of strategy is suggested by its use in
providing optimal strategies in emission-prediction games
described by Blackwell [Ref. 7] and Matula [Ref. 8]. The
pursuer-evader game, while similar to emission-prediction
games, is complicated by the fact that there are several
distinct sequences of moves which lead to the possible
terminal positions. Since the pursuer (predictor) must fire
at one of those terminal points and not at a specific
sequence of moves, the game is more complex. Bouchoux
describes a strategy based upon three states, A, B and C,
through which the evader transitions in a Markovian manner.
When in state A the evader always turns, while in states B
and C he always goes straight. After each move, straight or
turn, the evader transitions between states according to a
3x3 transition matrix and is ready for his next move. This
strategy is finite in the memory required by the evader, and
Bouchoux obtained a bound on the game value of 0.28922 by
optimizing upon the transition matrix.
III.   EXTENDED MARKOV STRATEGY

A.   MOTIVATION  AND  DESCRIPTION 

The evader strategy to be investigated will be called
Extended Markov because it is an extension of the finite
dependence of the Markov Hypothesis strategy. The dependence
will be finite but will extend beyond the previous n-1
steps. In the Markov Hypothesis strategy for the three-step
game, discussed in II.D.1., the best strategy for the
evader resulted in an upper bound on the game value of
0.29423. If the dependence is restricted to only the previous
move instead of the previous two moves, the best
strategy results in an upper bound of 0.29630 (Note: this
is equivalent to adding the constraint q_1 = q_2 to the
non-linear problem described in II.D.1., with a solution at
q_1 = q_2 = 2/3). Since Bram's strategy showed that the Markov
Hypothesis was not optimal for the three-step game, it seems
that a Markovian strategy where the dependence is finite but
extends beyond the last n-1 moves might result in a tighter
bound on the game value than previously obtained. This is
the class of strategies to be called Extended Markov. These
strategies for the three-step game, Markovian in nature,
will arise from a dependence upon the last three or more
moves and will be called the n-dependent strategies, where n
represents the level of dependence. In this context, the
Markov Hypothesis strategy for the three-step game is the
two-dependent strategy.
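
The figure of 0.29630 for dependence on only the previous move can be checked by imposing q_1 = q_2 = q on the eight conditional probabilities of II.D.1. and minimizing their maximum over q. The grid search below is a sketch chosen for illustration, not the method of the thesis, and the names are hypothetical.

```python
def three_step_worst_case(q1: float, q2: float) -> float:
    """Largest of the eight P(W=j | STATE=i) under the Markov Hypothesis strategy."""
    from_s = [
        (1 - q1) * q2 * q1,
        (1 - q1) * q2 * (1 - q1) + (1 - q1) * (1 - q2) ** 2 + q1 * (1 - q1) * q2,
        (1 - q1) * (1 - q2) * q2 + q1 * (1 - q1) * (1 - q2) + q1 * q1 * (1 - q1),
        q1 ** 3,
    ]
    from_t = [
        (1 - q2) * q2 * q1,
        (1 - q2) * q2 * (1 - q1) + (1 - q2) ** 3 + q2 * (1 - q1) * q2,
        (1 - q2) ** 2 * q2 + q2 * (1 - q1) * (1 - q2) + q2 * q1 * (1 - q1),
        q2 * q1 * q1,
    ]
    return max(from_s + from_t)


STEPS = 3000  # grid resolution (arbitrary; chosen so that 2/3 lies on the grid)
best_q = min((i / STEPS for i in range(STEPS + 1)),
             key=lambda q: three_step_worst_case(q, q))
best_value = three_step_worst_case(best_q, best_q)

print(round(best_q, 5), round(best_value, 5))
```

With q_1 = q_2 = 2/3 the probabilities of W=3 and W=4 both equal 8/27, which rounds to the 0.29630 quoted above.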




B.   GENERAL  N-DEPENDENT  STRATEGY 

In the n-dependent strategy the evader will determine
his next move based upon his previous n moves. The evader
can be thought of as controlling 2^n variables, each being
the probability of going (say) right given the previous n
steps have been in a certain sequence. We will utilize the
left-right symmetry of the problem by considering only paths
where the last move is to the (say) right, resulting in only
2^(n-1) variables, each representing the probability of going
(say) straight given the last n steps have produced a
certain (n-1)-bit sequence of straights and turns. The general
n-dependent strategy can be described by a Markov chain
having 2^(n-1) states corresponding to the 2^(n-1) different
(n-1)-bit sequences of straights and turns which are possible
based on the last n moves (i.e. conditioning upon the last
n moves is equivalent to conditioning on the last n-1
straights or turns). From each of the 2^(n-1) states there is
a fixed probability that the evader will maneuver to one of
the four final positions W in the next three steps. A
2^(n-1) x 2^(n-1) transition matrix will be used to describe
the conditional probability of turning or continuing straight
given the current state ((n-1)-bit sequence). Since the state
describes the previous n moves in terms of straights and
turns, only two possible transitions exist from each of the
states. The first n-2 bits of the state transitioned to are
determined by the last n-2 bits of the state transitioned
from; the last bit will be S or T depending upon the new
move. Due to this structure the transition matrix will be
completely defined by 2^(n-1) variables (called q_i,
i = 1, 2, ..., 2^(n-1)) which represent the probability of
continuing straight given the current state. The other
transition probability for that state will obviously be
(1-q_i). Using a transition matrix so constructed, the
conditional probability of ending in one of the four final
positions (W = 1, 2, 3 or 4) can be found. In order to arrive
in position 1, for example, the sequence of states
transitioned must result in the terminating three-bit
sequence TSS, as can be seen from Figure 2.8. Thus
P(W=w|STATE) is a function of the variables q_i
(i = 1, 2, ..., 2^(n-1)) and the best n-dependent strategy is
found by solving the following non-linear program:

$$ \min_{q_i} \; \Big[ \max_{w,\,\text{STATE}} P(W=w \mid \text{STATE}) \Big] \quad \text{s.t.} \quad 0 \le q_i \le 1.0, \qquad i = 1, 2, \ldots, 2^{n-1} $$

For general n, it is seen that the above program involves
minimizing the maximum of 2^(n+1) (4 positions x 2^(n-1) states)
non-linear functions of up to 2^(n-1) variables. No analytic
solution has been found and in later sections near-optimal
solutions will be found by non-linear search techniques.
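
The objective of this non-linear program can be evaluated for any level of dependence with a short routine. The sketch below is an assumption-laden illustration: states are represented as (n-1)-character strings of S and T with the oldest maneuver first, the position numbering follows the text (W=1 for TSS, W=4 for SSS), and the function names are hypothetical. It evaluates a given set of q_i's only; it does not perform the search.

```python
from itertools import product


def final_position(maneuvers) -> int:
    """Map a 3-bit S/T sequence to W in 1..4 (W=1 for TSS, W=4 for SSS)."""
    direction, offset = 1, 0  # +1 is the direction of the move made just before firing
    for m in maneuvers:
        if m == "T":
            direction = -direction
        offset += direction
    return (offset + 3) // 2 + 1


def payoff_matrix(q: dict) -> dict:
    """P(W=w | STATE) for every state; q maps each state string to P(straight | state)."""
    matrix = {}
    for state in q:
        row = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}
        for seq in product("ST", repeat=3):
            prob, current = 1.0, state
            for m in seq:
                prob *= q[current] if m == "S" else 1.0 - q[current]
                current = (current + m)[1:]  # drop the oldest bit, append the newest
            row[final_position(seq)] += prob
        matrix[state] = row
    return matrix


def upper_bound(q: dict) -> float:
    """Maximum over W and STATE of P(W=w | STATE): the quantity the evader minimizes."""
    return max(max(row.values()) for row in payoff_matrix(q).values())


# Two-dependent (Markov Hypothesis) values from II.D.1. as a check.
print(round(upper_bound({"S": 0.63397, "T": 0.73205}), 5))
```

Because each state has only two successors, the three-step enumeration costs eight paths per state regardless of n, so the expense of the search lies in the number of variables rather than in evaluating the objective.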
C.   THREE-DEPENDENT STRATEGY

The first extension of the Markov Hypothesis strategy is
the three-dependent strategy described by four states (SS, ST,
TS, TT) and a 4x4 transition matrix shown in Figure 3.1
where:

q_1 = P(next move is straight | State is SS)

or equivalently:

q_1 = P(next state is SS | last state was SS)

The sixteen conditional probabilities of terminating in one
of the four positions W, given the evader starts from one of
the four states, are listed in Table IV. The best solution
found using the three-dependent strategy gives an upper
bound on the game value of 0.28964 when:

q_1 = 0.66163        q_3 = 0.62489
q_2 = 0.70054        q_4 = 0.70054

The matrix of conditional probabilities evaluated at this
point is in Table V. This solution was found by utilizing
an improving feasible direction search which was started from
a known "good" solution. For the three-dependent strategy a
good starting point is found by applying the known
two-dependent (Markov Hypothesis) solution to the
three-dependent structure. If one applies the restriction
q_1 = q_3 and q_2 = q_4 to the three-dependent strategy, it is
equivalent to the strategy discussed in II.D.1., with an
upper bound of 0.29423 when:

q_1 = q_3 = 0.63397        q_2 = q_4 = 0.73205

Analogously any near-optimal solution to the n-dependent
strategy will provide a "good" initial solution to the
(n+1)-dependent strategy. The solution given above for the
three-dependent strategy is not known to be optimal, being
only a local minimum of the problem described in III.B.; it
does, however, represent a significant improvement over the
two-dependent strategy (0.29423) and is close in value to
the infinite strategy of Bram (0.28903). Appendix A presents
an analysis of the above three-dependent solution and
shows that the proposed solution does satisfy first-order
Kuhn-Tucker conditions (necessary but not sufficient) for a
global minimum. It is interesting to note that in the
proposed solution q_2 = q_4, or:

P(S|ST) = P(S|TT).

Additionally, in order for the pursuer to receive his maximum
achievable payoff he must refrain from attacking when the
state is TS or be limited to a payoff of 0.28191.

                        NEXT STATE

                  SS       ST       TS       TT

          SS     q_1     1-q_1      0        0
LAST      ST      0        0       q_2     1-q_2
STATE     TS     q_3     1-q_3      0        0
          TT      0        0       q_4     1-q_4

Figure 3.1    4x4 Transition Matrix for 3-Dependent Strategy.

TABLE IV
P(W=w|STATE) for 3-Dependent Strategy

Notation:   p_i = 1-q_i     i = 1,2,3,4

P(W=1|SS) = p_1 q_2 q_3
P(W=2|SS) = p_1 q_2 p_3 + p_1 p_2 p_4 + q_1 p_1 q_2
P(W=3|SS) = p_1 p_2 q_4 + q_1 p_1 p_2 + q_1 q_1 p_1
P(W=4|SS) = q_1 q_1 q_1
P(W=1|ST) = p_2 q_4 q_3
P(W=2|ST) = p_2 q_4 p_3 + p_2 p_4 p_4 + q_2 p_3 q_2
P(W=3|ST) = p_2 p_4 q_4 + q_2 p_3 p_2 + q_2 q_3 p_1
P(W=4|ST) = q_2 q_3 q_1
P(W=1|TS) = p_3 q_2 q_3
P(W=2|TS) = p_3 q_2 p_3 + p_3 p_2 p_4 + q_3 p_1 q_2
P(W=3|TS) = p_3 p_2 q_4 + q_3 p_1 p_2 + q_3 q_1 p_1
P(W=4|TS) = q_3 q_1 q_1
P(W=1|TT) = p_4 q_4 q_3
P(W=2|TT) = p_4 q_4 p_3 + p_4 p_4 p_4 + q_4 p_3 q_2
P(W=3|TT) = p_4 p_4 q_4 + q_4 p_3 p_2 + q_4 q_3 p_1
P(W=4|TT) = q_4 q_3 q_1

TABLE V
Good Evader Strategy in 3-Dependent Case

q_1 = P(S|SS) = 0.66163
q_2 = P(S|ST) = 0.70054
q_3 = P(S|TS) = 0.62489
q_4 = P(S|TT) = 0.70054

                  P(W=w|STATE)
STATE     W=1      W=2      W=3      W=4

SS       .14812   .27609   .28615   .28964
ST       .13109   .28964   .28964   .28964
TS       .16421   .28033   .28191   .27355
TT       .13109   .28964   .28964   .28964

TABLE VI
Good Evader Strategy in 4-Dependent Case

q_1 = P(S|SSS) = 0.65931        q_5 = P(S|TSS) = 0.66543
q_2 = P(S|SST) = 0.69579        q_6 = P(S|TST) = 0.69579
q_3 = P(S|STS) = 0.62474        q_7 = P(S|TTS) = 0.62474
q_4 = P(S|STT) = 0.69579        q_8 = P(S|TTT) = 0.69579

                  P(W=w|STATE)
STATE     W=1      W=2      W=3      W=4

SSS      .14809   .27677   .28854   .28659
SST      .13224   .28925   .28925   .28925
STS      .16312   .27814   .28465   .27409
STT      .13224   .28925   .28925   .28925
TSS      .14543   .27606   .28925   .28925
TST      .13224   .28925   .28925   .28925
TTS      .16312   .27814   .28465   .27409
TTT      .13224   .28925   .28925   .28925
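
The relation between the two-dependent starting point and the three-dependent solution can be checked with the transition structure of Figure 3.1. The sketch below walks three moves through the four states; the `NEXT` table, the function names and the sign convention for displacement are illustrative assumptions, and the code only evaluates the two parameter sets quoted in the text rather than carrying out the feasible direction search.

```python
# (state after a straight, state after a turn) for each state, following Figure 3.1.
NEXT = {"SS": ("SS", "ST"), "ST": ("TS", "TT"),
        "TS": ("SS", "ST"), "TT": ("TS", "TT")}


def row(state, q, steps=3, direction=1, offset=0):
    """Return {W: probability} for an evader starting in `state`, W numbered 1..4."""
    if steps == 0:
        return {(offset + 3) // 2 + 1: 1.0}
    after_straight, after_turn = NEXT[state]
    out = {}
    branches = ((q[state], after_straight, direction),
                (1.0 - q[state], after_turn, -direction))
    for prob, nxt, d in branches:
        for w, pr in row(nxt, q, steps - 1, d, offset + d).items():
            out[w] = out.get(w, 0.0) + prob * pr
    return out


def bound(q):
    """Largest conditional probability over all states and positions."""
    return max(max(row(state, q).values()) for state in NEXT)


# Two-dependent solution embedded in the three-dependent structure (q_1 = q_3, q_2 = q_4).
embedded = {"SS": 0.63397, "TS": 0.63397, "ST": 0.73205, "TT": 0.73205}
# Three-dependent values quoted in the text.
three_dep = {"SS": 0.66163, "ST": 0.70054, "TS": 0.62489, "TT": 0.70054}

print(round(bound(embedded), 5))                        # starting point
print(round(bound(three_dep), 5))                       # after the search
print(round(max(row("TS", three_dep).values()), 5))     # best payoff when firing in state TS
```

The third figure shows why the pursuer loses by firing in state TS: the largest entry of that row falls short of the bound attained in the other three states.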

D.   FOUR  AND  FIVE-DEPENDENT  STRATEGIES 

The treatment of the four-dependent and five-dependent
strategies is equivalent to the previously described
three-dependent strategy with the expansion of the state space
and number of variables involved to eight and sixteen
respectively. Good solutions to the four and five-dependent
strategies were found, as in the three-dependent case, by
starting at a known near-optimal set of values for the q_i's
and conducting an improving feasible direction search until
a local minimum was found. The best solutions thus found to
the four and five-dependent strategies and the resulting
conditional probability matrices are shown in Tables VI and
VII.

E.   CHARACTERISTICS  OF  THREE,  FOUR  AND  FIVE-DEPENDENT 

STRATEGIES 

The  solutions  found  for  the  three,  four  and  five- 
dependent  strategies,  outlined  in  Tables  V,  VI  and  VII  show 
several  revealing  characteristics.   In  each  case  the  condi- 
tional probability  of  continuing  straight  given  the  n-1  bit 
state  is  not  dependent  upon  all  of  the  information  contained 
in  that  n-1  bit  sequence.   The  probabilities  are  dependent 
only  upon  the  number  of  time  steps  elapsed  since  the  last 
turn  maneuver  and  not  upon  any  turn-straight  information 
further  in  the  past  than  that  last  turn.   For  example, 
letting  t  denote  the  number  of  time  steps  since  the  last 
turn,  then  in  the  five-dependent  solution: 

$$q_3 = q_7 = q_{11} = q_{15} = P(S \mid t = 2)$$

$$q_5 = q_{13} = P(S \mid t = 3)$$

$$q_9 = P(S \mid t = 4)$$

$$q_1 = P(S \mid t > 4)$$


TABLE VII
Good Evader Strategy in 5-Dependent Case

q1 = P(S|SSSS) = 0.66120        q9  = P(S|TSSS) = 0.65034
q2 = P(S|SSST) = 0.69385        q10 = P(S|TSST) = 0.69385
q3 = P(S|SSTS) = 0.62470        q11 = P(S|TSTS) = 0.62470
q4 = P(S|SSTT) = 0.69385        q12 = P(S|TSTT) = 0.69385
q5 = P(S|STSS) = 0.66698        q13 = P(S|TTSS) = 0.66698
q6 = P(S|STST) = 0.69385        q14 = P(S|TTST) = 0.69385
q7 = P(S|STTS) = 0.62470        q15 = P(S|TTTS) = 0.62470
q8 = P(S|STTT) = 0.69385        q16 = P(S|TTTT) = 0.69385

P(W=Ŵ|STATE)

STATE      Ŵ=1       Ŵ=2       Ŵ=3       Ŵ=4

SSSS      .14685    .27541    .28867    .28907
SSST      .13270    .28910    .28910    .28910
SSTS      .16267    .28569    .28910    .27097
SSTT      .13270    .28910    .28910    .28910
STSS      .14435    .27975    .28910    .28680
STST      .13270    .28910    .28910    .28910
STTS      .16267    .28569    .28910    .27097
STTT      .13270    .28910    .28910    .28910
TSSS      .15156    .27670    .28742    .28432
TSST      .13270    .28910    .28910    .28910
TSTS      .16267    .28569    .28910    .27097
TSTT      .13270    .28910    .28910    .28910
TTSS      .14435    .27975    .28910    .28680
TTST      .13270    .28910    .28910    .28910
TTTS      .16267    .28569    .28910    .27097
TTTT      .13270    .28910    .28910    .28910


It is hypothesized that this characteristic holds for the
optimal form of any n-dependent strategy.  If this is so, the
n-dependent strategy can be seen to be a finite (truncated)
version of the Bram strategy presented in II.D.2, and as the
level of dependence n is increased without bound the upper
bound is expected to approach Bram's bound of 0.28903.
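
The dependence on t alone can be checked directly against the q values of Table VII. The following is a minimal sketch, not part of the original analysis; the names Q_FIVE_DEPENDENT, steps_since_last_turn and group_by_t are introduced here for illustration only. It groups the sixteen states by the number of steps since the last turn and lists the distinct q values found in each group.

```python
# q values transcribed from Table VII: P(S | last four moves), five-dependent case.
Q_FIVE_DEPENDENT = {
    "SSSS": 0.66120, "SSST": 0.69385, "SSTS": 0.62470, "SSTT": 0.69385,
    "STSS": 0.66698, "STST": 0.69385, "STTS": 0.62470, "STTT": 0.69385,
    "TSSS": 0.65034, "TSST": 0.69385, "TSTS": 0.62470, "TSTT": 0.69385,
    "TTSS": 0.66698, "TTST": 0.69385, "TTTS": 0.62470, "TTTT": 0.69385,
}


def steps_since_last_turn(state):
    """Return t, the number of steps since the most recent T (t = 1 if the
    last move was a turn). Returns None when the state holds no turn at all,
    i.e. t exceeds the length of the remembered sequence."""
    idx = state.rfind("T")
    return None if idx < 0 else len(state) - idx


def group_by_t(q_by_state):
    """Collect the distinct q values observed for each value of t."""
    groups = {}
    for state, q in q_by_state.items():
        groups.setdefault(steps_since_last_turn(state), set()).add(q)
    return groups


groups = group_by_t(Q_FIVE_DEPENDENT)
for t in sorted(groups, key=lambda k: (k is None, k)):
    label = "t>4" if t is None else f"t={t}"
    print(label, sorted(groups[t]))
```

If the characteristic holds, every group contains exactly one value, so the sixteen q's reduce to five distinct probabilities.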

Each of the investigated strategies is also characterized
by having some states in which the pursuer must refrain from
firing, else he forfeits his ability to maximize his payoff.
As the level of dependence increases, however, the penalty to
the pursuer who fires when the evader is in one of these
states diminishes.  Table III shows that under Bram's
strategy there is no time at which the pursuer cannot
achieve his maximum payoff, given he always fires at position
W = 3.
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IV.   FOUR-STEP  GAME 

The  four-step  pursuer-evader  game  has  been  the  subject 
of  little  interest  due  to  the  unsolved  nature  of  the  three- 
step  game.   We  shall  briefly  look  at  the  four-step  game  and 
discover  that  the  apparent  characteristic  structure  of  the 
three-step  extended  Markov  strategies  does  not  extend  to  the 
four-step game.  Given a four-step time delay between the
attacker's time of fire and subsequent detonation, the evader
may achieve five different positions through the sixteen
different four-bit sequences of turns and straights, as shown
in Figure 4.1.  The Markov Hypothesis strategy solution to
the four-step game is due to Washburn [Ref. 9].  In the four-
step game the Markov Hypothesis has dependence extending to
the last three moves; the best strategy under this hypothesis
bounds the value of the game to 0.23740 or below.  The q
values and resulting conditional probability matrix are shown
in Table VIII.  The first extended Markov strategy of the
four-step game, and the only one investigated, is the four-
dependent strategy, in which dependence reaches back to the
last four moves.  The best solution found using the
four-dependent strategy is shown in Table IX and provides an
upper bound of 0.23734.  While this is an improvement over
the Markov Hypothesis solution of Washburn, the improvement
is very slight.  Additionally, no underlying characteristic
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such  as  discussed  in  III.E.  for  the  three-step  extended 
Markov strategies is apparent from the three and four-
dependent strategies investigated for the four-step game.
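
The five achievable positions and the Markov Hypothesis bound can be reproduced from the q values of Table VIII. The sketch below is illustrative only and rests on two assumptions not stated in this chapter: that a straight (S) keeps the evader's current direction of travel while a turn (T) reverses it, and that each q is the probability of a straight given the last two moves. The names Q_MARKOV, displacement and payoff_matrix are hypothetical.

```python
from itertools import product

# q values transcribed from Table VIII: P(S | last two moves).
Q_MARKOV = {"SS": 0.69681, "ST": 0.69681, "TS": 0.70169, "TT": 0.69675}


def displacement(moves):
    """Net displacement after the moves, measured along the heading held
    before the first move. Assumes S keeps the heading and T reverses it."""
    heading, pos = 1, 0
    for m in moves:
        if m == "T":
            heading = -heading
        pos += heading
    return pos


def payoff_matrix(q, n_steps):
    """Return {state: {displacement: probability}} for each memory state."""
    matrix = {}
    for state in q:
        row = {}
        for moves in product("ST", repeat=n_steps):
            history, prob = state, 1.0
            for m in moves:
                p_straight = q[history]
                prob *= p_straight if m == "S" else 1.0 - p_straight
                history = (history + m)[-len(state):]  # slide the memory window
            d = displacement(moves)
            row[d] = row.get(d, 0.0) + prob
        matrix[state] = row
    return matrix


matrix = payoff_matrix(Q_MARKOV, n_steps=4)
positions = sorted(matrix["SS"])
best_payoff = max(max(row.values()) for row in matrix.values())
print(positions)
print(round(best_payoff, 4))
```

Under these assumptions the sixteen move sequences collapse onto five displacements, and the largest entry of the matrix is the pursuer's best payoff against this strategy, which should agree with the bound quoted above to within the rounding of the tabulated q values.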
Figure 4.1   Achievable Evader Positions in Four-Step Game


TABLE VIII
Markov-Hypothesis Strategy for Four-Step Game

q1 = 0.69681        q3 = 0.70169
q2 = 0.69681        q4 = 0.69675

P(W=Ŵ|STATE)

STATE     Ŵ=1       Ŵ=2       Ŵ=3       Ŵ=4       Ŵ=5

SS       .10330    .18677    .23739    .23678    .23575
ST       .10329    .18511    .23709    .23710    .23740
TS       .10163    .18615    .23740    .23740    .23740
TT       .10331    .18512    .23709    .23710    .23738


TABLE IX
Three-Dependent Strategy to Four-Step Game

q1 = 0.69724        q5 = 0.69728
q2 = 0.69727        q6 = 0.69727
q3 = 0.70466        q7 = 0.70469
q4 = 0.69654        q8 = 0.69724

P(W=Ŵ|STATE)

STATE     Ŵ=1       Ŵ=2       Ŵ=3       Ŵ=4       Ŵ=5

SSS      .10306    .18769    .23624    .23668    .23634
SST      .10294    .18508    .23733    .23731    .23733
STS      .10053    .18828    .23654    .23733    .23732
STT      .10329    .18518    .23731    .23712    .23709
TSS      .10457    .18826    .23622    .23612    .23482
TST      .10294    .18508    .23733    .23731    .23733
TTS      .10052    .18827    .23654    .23733    .23733
TTT      .10306    .18509    .23731    .23721    .23733
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V.   CONCLUSIONS  AND  REMARKS 

The  three-step  pursuer-evader  game  remains  unsolved. 
The  investigation  of  the  extended  Markovian  strategies  has 
been  shown  to  result  in  improved  evader  strategies  over  the 
Markov  Hypothesis  but  is  not  known  to  provide  a  better 
strategy  than  the  infinite  memory  strategy  of  Bram;  in  fact 
it  is  hypothesized  that  the  n-dependent  extended  Markov 
strategy  to  the  three-step  game  represents  a  finite  approxi- 
mation to  the  strategy  of  Bram.   In  this  respect  the  results 
are  not  entirely  disappointing  in  that  they  provide  a  finite 
strategy  which  appears  to  converge  rather  rapidly  to  a 
strategy  equivalent  to  Bram's  infinite  memory  strategy.   The 
five-dependent strategy to the three-step game relies upon
five distinct variables:

$$q_1, \quad q_2, \quad q_3, \quad q_5, \quad q_9$$

which provide an upper bound of 0.28910, reasonably close to
the bound of 0.28903 provided by Bram's infinite strategy.
The near-optimal extended Markov strategies presented in
Tables V, VI, and VII represent local minima to the
non-linear programming problem discussed in III.B.
While  these  can  be  seen  to  represent  improvements  from  the 
Markov  Hypothesis  strategy  they  may  not  be  the  globally 
minimum  strategies  within  the  extended  Markov  structure.   As 
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the  level  of  dependence  in  the  extended  Markov  strategies 
increases, the mathematical complexity increases
disproportionately; only the apparent characteristic of these
extended Markov strategies, discussed in III.E., makes them
remotely attractive.

It  still  remains  to  be  answered  why  the  three-step  game 
is  apparently  non-Markovian  in  its  optimal  evader  strategy 
while the one and two-step games are Markovian.  The evader
strategy  proposed  by  this  thesis  as  well  as  the  strategy 
described  by  Bouchoux  represent  abstractions  from  the  strict 
Markov  Hypothesis  solution  and  although  both  strategies 
represent  a  lowering  of  the  pursuer's  maximum  payoff, 
neither  is  as  tight  as  the  infinite  strategy  of  Bram  which  is 
strictly  non-Markovian  in  nature.   While  improved  finite 
strategies  may  be  possible  by  further  abstraction  from  a 
strictly  Markovian  strategy,  it  has  been  conjectured  that  no 
finite  strategy  is  optimal  for  the  evader.   This  is  known  to 
be  true  for  the  pursuer  since  he  must  observe  the  evader  for 
an  ever-increasing  length  of  time  if  he  wishes  to  achieve 
optimality  (with  the  exception  of  the  one-step  game  where 
both  sides  have  finite  optimal  strategies).   Bouchoux 
suggests  that  a  generalization  of  his  sub-Markov  strategy, 
involving  three  distinct  Markov  states  each  with  some  fixed 
probability  of  generating  a  straight  or  a  turn,  might  provide 
a  tighter  bound  on  the  game  value  due  to  its  further 
abstraction  from  a  Markov  behavior.   However,  the  mathematical
complexity  of  locating  optimal  or  near-optimal  strategies
within  this  framework  is  considerable. 

The  four-step  game  appears  even  more  difficult.   The 
Markov  Hypothesis  solution  is  shown  to  be  a  sub-optimal 
strategy,  being  dominated  by  the  three-dependent  extended 
Markov  strategy  of  Table  IX.   The  strategies  found  to  the 
four-step  game  in  Tables  VIII  and  IX  appear  to  preclude  an 
extension  of  Bram's  infinite  strategy  to  the  four-step  game. 
The  apparent  dissimilarity  between  the  known  near-optimal 
evader  strategies  from  the  two  to  three  to  four-step  games  is 
perplexing. 

The  discrete  evasion  game  upon  a  two  or  three  dimensional 
surface  is  another  area  which  holds  promise  for  future 
research.   The  work  of  Ferguson  solves  the  two-step  game  for 
a  special  class  of  graphs  he  calls  restricted  n-graphs; 
however  the  two-step  game  upon  more  general  two-dimensional 
surfaces,  as  well  as  the  three-step  game,  are  unsolved. 

The  discrete  pursuer-evader  game,  as  described  by  Isaacs 
in 1954, was generated as a simplification of a much more
complex  problem.   The  continuing  mystery  surrounding  all  but 
the  simplest  of  these  "simplified"  games  provides  a  wealth 
of  opportunity  and  motivation  for  future  research. 
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APPENDIX  A 
INVESTIGATION  OF  THE  THREE-STEP  EXTENDED  MARKOV  STRATEGY 

In  III.B.,  the  general  n-dependent  extended  Markov 
strategy  was  presented.   The  best  solution  found  for  the 
case  n=3  is  given  in  Table  V.   As  stated  earlier,  this  solu- 
tion is  not  known  to  be  optimal  but  can  be  shown  to  satisfy 
the  first-order  Kuhn-Tucker  conditions  (necessary  but  not 
sufficient)  for  a  global  minimum. 

For  the  three-dependent  case  the  problem  may  be  stated 
as  follows: 


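$$\min_{q_i} \Big[ \max_{\hat{W},\,\mathrm{STATE}} \{ P(W = \hat{W} \mid \mathrm{STATE}) \} \Big]$$

$$\text{s.t.} \quad 0.0 \le q_i \le 1.0, \qquad i = 1, 2, 3, 4$$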


There are sixteen separate functions (see Table IV), from
which the maximum will be selected by the pursuer's choice
of $\hat{W}$ and STATE (i.e. by his selection of aim point and
time of fire); the evader must select the $q_i$'s so as to
minimize this maximum payoff.  Let $f_1, f_2, \ldots, f_{16}$
represent the sixteen functions described in Table IV; then
the problem becomes:


$$\min_{q_i} \Big[ \max_{\hat{W},\,\mathrm{STATE}} ( f_1, f_2, \ldots, f_{16} ) \Big]$$

$$\text{s.t.} \quad 0.0 \le q_i \le 1.0, \qquad i = 1, 2, 3, 4$$



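Introducing a dummy variable $q_5$, the above non-linear
program may be equivalently written:

$$
\begin{aligned}
\min \quad & q_5 \\
\text{s.t.} \quad & f_j - q_5 \le 0.0, & j &= 1, \ldots, 16 \\
& q_i - 1.0 \le 0.0, & i &= 1, \ldots, 4 \\
& q_i \ge 0.0, & i &= 1, \ldots, 4
\end{aligned}
$$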

The structure of this problem allows some additional
conditions to be placed upon the optimal solution:

$$0.0 < q_i < 1.0, \qquad i = 1, 2, 3, 4$$

Close inspection of the functions $f_j$ shows that if

$$q_i = 0.0 \quad \text{or} \quad p_i = 1.0 - q_i = 0.0$$

then at least one of the $f_j$'s will have a value of 0.0.  If
any $f_j = 0.0$ then the remaining three $f_j$'s associated with
the same initial state must sum to 1.0, since for any initial
state:

$$P(W = 1, 2, 3 \text{ or } 4 \mid \mathrm{STATE}) = 1.0$$

The maximum of three non-negative numbers which sum to 1.0
is at least 1/3, which is greater than the known upper bound
on the value of the game.  Therefore:

$$0.0 < q_i < 1.0, \qquad i = 1, \ldots, 4$$



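Based upon the above characteristic of the problem the
constraints

$$q_i - 1.0 \le 0.0, \qquad i = 1, \ldots, 4$$

will not be binding at the optimal solution and may be
dropped without consequence, resulting in:

$$
\begin{aligned}
\min \quad & q_5 \\
\text{s.t.} \quad & f_j - q_5 \le 0.0, & j &= 1, \ldots, 16 \\
& q_i \ge 0.0, & i &= 1, \ldots, 5
\end{aligned}
$$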


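The first-order Kuhn-Tucker conditions for the above problem
require that, at an optimal point, there exist a set of
$\lambda$'s such that:

$$
\begin{aligned}
\frac{\partial L}{\partial q_i} &\ge 0.0, &
q_i \frac{\partial L}{\partial q_i} &= 0.0, &
q_i &\ge 0.0, & i &= 1, \ldots, 5 \\
\frac{\partial L}{\partial \lambda_j} &\ge 0.0, &
\lambda_j \frac{\partial L}{\partial \lambda_j} &= 0.0, &
\lambda_j &\le 0.0, & j &= 1, \ldots, 16
\end{aligned}
$$

where:

$$L(q, \lambda) = q_5 - \sum_{j=1}^{16} \lambda_j (f_j - q_5)$$
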
These conditions may be further modified:

$$
\frac{\partial L}{\partial q_i} = 0.0, \qquad
q_i \frac{\partial L}{\partial q_i} = 0.0, \qquad
q_i > 0.0, \qquad i = 1, \ldots, 5
$$

In the proposed near-optimal solution in Table V, seven
of the sixteen inequality constraints are binding; that is:

$$f_4 = f_6 = f_7 = f_8 = f_{14} = f_{15} = f_{16} = q_5 = 0.28964$$

The remaining nine constraints are slack; it follows that:

$$\lambda_1 = \lambda_2 = \lambda_3 = \lambda_5 = \lambda_9 = \lambda_{10} = \lambda_{11} = \lambda_{12} = \lambda_{13} = 0.0$$

The proposed solution must therefore satisfy the following
conditions:

$$
\begin{aligned}
\frac{\partial L}{\partial q_i} &= 0.0, & i &= 1, \ldots, 5 \\
\lambda_j &\le 0.0, & j &= 4, 6, 7, 8, 14, 15, 16
\end{aligned}
$$

With the substitution of the values

$$q_1 = 0.66163, \quad q_2 = 0.70054, \quad q_3 = 0.62489, \quad q_4 = 0.70054$$

the five constraints ($\partial L / \partial q_i = 0.0$) become
a set of five linear equations in seven unknowns
($\lambda_4, \lambda_6, \lambda_7, \lambda_8, \lambda_{14}, \lambda_{15}, \lambda_{16}$).
Any solution to this set of equations which also satisfies
the condition:

$$\lambda_j \le 0.0, \qquad j = 4, 6, 7, 8, 14, 15, 16$$



will satisfy the modified Kuhn-Tucker conditions.  Using
linear programming methods, such a set of $\lambda$'s was found,
thereby verifying the satisfaction of the Kuhn-Tucker
conditions at the proposed three-dependent strategy of Table V.
The  near-optimal  solutions  to  the  four  and  five-dependent 
strategies  (Tables  VI  and  VII)  could  be  analyzed  in  a 
similar  manner. 
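
The set of binding constraints quoted in this appendix can be checked numerically from the four q values. The sketch below is illustrative only and is not the computation used in the thesis. It assumes that a straight keeps the evader's direction of travel and a turn reverses it, that the four initial states are ordered SS, ST, TS, TT, that aim points W = 1, ..., 4 correspond to displacements -3, -1, +1, +3, and that the sixteen functions are numbered state by state and then by aim point. The names Q, STATES, DISPLACEMENTS and hit_probability are hypothetical.

```python
from itertools import product

# q values quoted in this appendix for the three-dependent strategy.
Q = {"SS": 0.66163, "ST": 0.70054, "TS": 0.62489, "TT": 0.70054}
STATES = ["SS", "ST", "TS", "TT"]   # assumed ordering of the initial states
DISPLACEMENTS = [-3, -1, 1, 3]      # assumed to correspond to W = 1, 2, 3, 4


def hit_probability(state, target):
    """P(evader is at `target` after three moves | last two moves = state)."""
    total = 0.0
    for moves in product("ST", repeat=3):
        history, prob, heading, pos = state, 1.0, 1, 0
        for m in moves:
            prob *= Q[history] if m == "S" else 1.0 - Q[history]
            if m == "T":
                heading = -heading  # a turn reverses the direction of travel
            pos += heading
            history = history[1] + m
        if pos == target:
            total += prob
    return total


# f[j] for j = 1..16, numbered state by state and then by aim point (assumed).
f = {4 * s + w + 1: hit_probability(STATES[s], DISPLACEMENTS[w])
     for s in range(4) for w in range(4)}
q5 = max(f.values())
binding = sorted(j for j, value in f.items() if q5 - value < 1e-4)
print(round(q5, 4), binding)
```

The tolerance of 1e-4 allows for the rounding of the tabulated q values; constraints whose value lies within that tolerance of the maximum are treated as binding.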
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